- # Message from: Ian Feldman, the Current Setext Oracle
- # Date: Sun, 16 Aug 92 08:19:00 +0100 (CET)
- # Reply-To: setext-list@random.se (Keepers of The Setext Flame[tm])
- # Replaces: setext_concepts_Mar92.etx
- # Lines: 240
- # Subject: setext_concepts_Aug92.etx
-
-
- Thank you for your interest in the setext format. Enclosed is an
- advance sheet that will remain in effect until the first public
- release of the setext format package (originally planned for around
- March 1st, 1992, now delayed).
-
- If you recognize some of the arguments presented here then that is
- the price that you are paying for having been an early bird. ;-))
- Please note that my email address may change in the near future;
- consult the trailer of weekly issues of TidBITS for the most
- current one.
-
-
- What is setext
- ----------------
- As originally explained in TidBITS#100 and mentioned there from
- now on, that publication now comes "wrapped as a setext." The noun
- itself stands for both a method to wrap (format) texts according
- to specific layout rules and for a single _structure_enhanced_
- text. The latter is a text which has been formatted in such a
- fashion that it contains clues as to the typographical and logical
- structure of its source (word-processed) document(s), if any.
- Those clues, which I call "typotags," facilitate later automatic
- detection of that structure so it can be validated and extracted/
- processed/ transformed/ enhanced as needed, if needed.
-
- It follows that setexts, being nothing but pure text (albeit with
- a special layout), are eminently readable using ANY editor or
- word processor in existence today or tomorrow, and not only on
- the Macintosh either. ANY computer, any computer program that is
- capable of opening and reading text files can be used for reading
- setexts. By default all properly setext-ized files will have an
- ".etx" or ".ETX" suffix. This stands for an "emailable/ enhanced
- text", the ExtraTerrestrial overtones notwithstanding ;-))
-
- Unlike other forms of text encoding that use explicit, visible tag
- elements such as <this> and <\that>, the setext format relies
- solely on the presence of _implicit_ typotags, carefully chosen
- to be as visually unobtrusive as possible. The underlined word
- above is one such instance of the de facto "invisible" coding.
- Inserted typotags will at worst appear as mere "typos" in the text.
-
- Similarly, just to give an example, here is a short description
- of the four types of word emphasis typotags that setexts MAY
- contain, limited to one emphasis type ONLY per word or word group:
-
- -------------------  ----------------------------  --------------
- **aBoldWord**        **multiple bold words**       ; bold-tt
- _anUnderlinedWord_   _multiple underlined words_   ; underline-tt
- ~anItalicWord~                                     ; italic-tt
- aHotWord_            multiple_hot_words_           ; hot-tt
- -----------------------------------------------------------------
- the 'hot-tt' is synonymous with the 'grouped' style of HyperCard
- only single ~italic~ words are allowed for visual-clarity reasons
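
The four emphasis typotags in the table above can be located with ordinary
pattern matching. The following is a minimal sketch in Python, not part of
the setext package: the names PATTERNS and find_emphasis are hypothetical,
and the boundary rules (a typotag must not touch a neighbouring word
character) are an assumption, since this sheet does not spell them out.

```python
import re

# Hypothetical patterns, one per emphasis typotag from the table above.
# Assumption: a typotag is only valid when it does not touch a neighbouring
# word character, so that "snake_case" is left alone.
PATTERNS = [
    ("bold-tt", re.compile(r"(?<!\S)\*\*([^*\n]+)\*\*(?!\w)")),
    ("underline-tt", re.compile(r"(?<!\w)_([^_\n]+)_(?!\w)")),
    ("italic-tt", re.compile(r"(?<![\w~])~([^~\s]+)~(?![\w~])")),  # single word only
    ("hot-tt", re.compile(r"(?<!\w)([A-Za-z0-9]+(?:_[A-Za-z0-9]+)*)_(?!\w)")),
]


def find_emphasis(line):
    """Return (typotag_name, text) pairs in order of appearance on the line."""
    claimed = []  # spans already assigned to a typotag: one emphasis type per word
    hits = []
    for name, pattern in PATTERNS:
        for match in pattern.finditer(line):
            start, end = match.span()
            # Skip a candidate that overlaps an earlier, higher-priority typotag
            # (for example the closing word of an underlined group).
            if any(start < c_end and c_start < end for c_start, c_end in claimed):
                continue
            claimed.append((start, end))
            hits.append((start, name, match.group(1)))
    return [(name, text) for _, name, text in sorted(hits)]


if __name__ == "__main__":
    sample = ("A **very bold** claim, _underlined words_, "
              "one ~italic~ word and a hot_link_ too.")
    for name, text in find_emphasis(sample):
        print(name, "->", text)
```

The patterns are tried in a fixed priority order and each matched span is
claimed once, which is one way to honour the "one emphasis type ONLY per
word or word group" rule.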
-
- Please note, however, that the <end> strings previously found in
- TidBITS #100-110 were not part of the format as such, but were
- added by Adam Engst for a specific setext-raterrestrial purpose.
-
-
- Why is setext
- ---------------
- Data formats like the RTF (Rich Text Format) and SGML (Standard
- Generalized Markup Language) have been designed for processing ONLY
- by software. Setext, on the other hand, has been _optimized_
- for reading directly by human eyes on what probably is still the
- lowest common denominator of today's computer hardware, an 80-
- character by 24-line terminal screen (or, in effect, any computer
- screen). It follows that the format is intended chiefly for
- smaller texts, those of a size that a human reader might find
- within her capacity of overview.
-
- I need to state explicitly that although TidBITS is currently the
- only setext publication in wide distribution, the setext format is
- NOT synonymous with TidBITS's layout. Many other distinctive
- layouts are possible. TidBITS is therefore just an _instance_ of
- the format, not THE setext format. More specifically, that also
- means that any of you thinking of writing a "TidBITS browser"
- should in reality be considering a "setext browser." Otherwise
- your program will in all probability be able to recognize only
- today's specifically-formatted TidBITS and no other future setext
- publications (which are in the making), including that of a future
- possibly changed or modified TidBITS.
-
-
- How come is setext
- --------------------
- The idea of a common format for online-distributed publications
- had been growing in my mind since approximately 1986-87. It came
- into focus after I started corresponding with Adam C. Engst,
- following my April 1990 criticism of the original TidBITS, presented
- as a HyperCard stack. Gradually it ceased to be a redesign effort for
- the TidBITS and became instead a generic format for all kinds of
- electronic publications (which I affectionately call "the compu-
- rags" ;-)). I hit on the current "tagless" version of the format
- in the winter of 1990 and the first internal beta product -- a
- setext encoder for TidBITS -- saw the light of the day in July of
- 1991. Later Adam wrote a setext-encoding Nisus macro for his
- personal use, the one he now uses to wrap the weekly issues of
- TidBITS (he isn't putting all those spaces and dashes in there
- entirely by hand! ;-))
-
- As can be seen from the above, setext is not some quickie project,
- thought up and finalized in a few afternoons. A lot of thought
- has gone into it and some of it has survived to the present day.
- Needless to say the format definition will be placed in the
- public domain and its use actively promoted by the many parties
- that have expressed an interest in adopting it for their own use.
-
-
- What for is setext
- --------------------
- The setext (data) format is intended primarily for use by online-
- distributed periodic publications. It is particularly well-suited
- to all kinds of electronic digests and other types of repetitively
- disseminated text information. Despite its formal appearance as
- "mere stream of unenhanced ASCII characters on a computer screen"
- setext is rich enough and unambiguous enough to permit construction
- of fairly complex encoding engines for specific application purposes
- (also on top of the format) and to allow easy implementation of a
- countless number of front-end browsers/ decoders and other
- reading/ archiving-enhancement tools.
-
- While setext does, indeed, allow the preservation of a source
- text's structure it does not, by definition, guarantee the 100%
- ability to recreate it at the destination. Any word originally
- styled as **bold** may in effect end up as Yellow-On-Black or be
- set in a different font, or considered a candidate for a
- cumulative keywords list or be deemphasized at will. There are
- not now and never will be any rules to govern how decoded setexts
- should be presented at the receiving end. It will be up to each
- front-end's author to ensure that decoded (no-longer-)setexts are
- presented in a fashion that's agreeable to his/ her end users.
- There is plenty of sound advice and recommendations on how to
- achieve that but that's an entirely different matter.
-
- Those principles also apply to decoding of a setext's logical,
- rather than merely its typographical, structure. The format does
- not rely on some large set of predefined, unambiguous, mutually-
- exclusive rules. Rather, it "knows of" just the barest set of
- typotags (1 required, 12 optional), knows their symbolic purpose
- and what criteria to use when looking for and validating them in
- a setext. This approach differs somewhat from the commonly heard
- programmers' wish for clearly-delimited data patterns that could
- be scanned for quickly and their position used as an offset to
- the text to be displayed.
-
- Setext has those patterns too but, since it relies primarily on
- de facto "invisible" elements that could also be part of the text
- itself, it must validate them first before proceeding with any
- enhancements. Writing a real setext decoder is therefore
- conceptually much closer to (though nowhere near as hard as)
- writing an SGML application than it is to writing a macro routine
- to munge some data in one predefined fashion. In spite of all
- that, setext tools should be easily implementable with, and no
- more complex than, typical HyperTalk, sed, awk and perl scripts.
- The barest minimum required for such an attempt is an intelligent
- search/ replace function in a programmable macro editor. Though
- yet to be proven, conceptually there is nothing in the format to
- prevent implementation of real-time setext browsers written in,
- say, some advanced pattern-matching macro language of a terminal
- emulator program.
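
The "validate first, enhance later" idea can be illustrated with the
title-plus-dashes layout that this very sheet uses for its section heads.
What follows is only a sketch in Python under stated assumptions: the
function name find_subheads, the minimum rule length of three dashes and
the width tolerance of two characters are all hypothetical, chosen to fit
the headings seen here, and are not taken from the format definition.

```python
def find_subheads(lines, tolerance=2):
    """Return (line_index, title) for each title/underline pair that validates.

    Assumptions (not from the setext definition): a candidate underline is a
    line of at least three dashes, and it only counts as a heading marker if
    the line above it is real text of roughly the same width.
    """
    found = []
    for i in range(len(lines) - 1):
        title = lines[i].strip()
        rule = lines[i + 1].strip()
        # Step 1: look for the candidate pattern, a line made only of dashes.
        if len(rule) < 3 or set(rule) != {"-"}:
            continue
        # Step 2: validate. A rule under a blank line or under another rule
        # is decoration (a table border, a separator), not a heading.
        if not title or set(title) == {"-"}:
            continue
        # Step 3: validate. The rule should roughly match the title's width.
        if abs(len(rule) - len(title)) > tolerance:
            continue
        found.append((i, title))
    return found


if __name__ == "__main__":
    sample = [
        "What is setext",
        "----------------",
        "Body text here",
        "",
        "-----",
        "a much longer line of ordinary text",
        "---",
    ]
    print(find_subheads(sample))
```

Only the first pair in the sample passes all three checks; the bare rule
after the blank line and the short rule under the long sentence are both
rejected, which is the kind of validation the paragraph above asks for.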
-
-
- Where is setext
- -----------------
- As of now (Aug '92) there is finally one validated setext browser
- in existence, the Easy View 2.1 application, written by Akif Eyler,
- free to distribute and use. Though not yet capable of decoding
- style-typotags nor unwrapping (reflowing) paragraphs of text, it
- is a very nice application all the same. It allows for cumulative
- indices of TidBITS or any other setext, and is capable of parsing
- and displaying a few other useful data formats as well (Info-Mac
- Digest among others). Adam Engst has uploaded it to CompuServe
- (MACSYS #6), ZiffNet/Mac (ZMC:DOWNTECH #0), America Online ("look
- for a new TidBITS library in the Hardware forum coming soon"), and
- the mother of all Macintosh archives, the <sumex-aim.stanford.edu>
- (look for it in /info-mac/digest/tb or /info-mac/app directories).
- It is, no doubt, already present at many private Bulletin-Board
- Systems etc. Read "TidBITS#136/03-Aug-92" for more information
- about the Easy View 2.1 and, if you use the application, do write
- Akif a note of appreciation.
-
- Other than that I have a working prototype of a setext front-end,
- which has been "not far from completion" for the last half year or
- so (draw your own conclusions). A paging macro routine for the
- rn, a popular newsreader under unix, allowing forward jumps to the
- next topic of any TidBITS read online in comp.sys.mac.digest group
- has been published in TidBITS#110/09-Mar-92. On top of that there
- is a mailing list for developers and future setext publishers:
- <setext-list@random.se>. If interested, please send me a short
- note stating the degree of your future involvement (wants to write a
- setext tool or 'just an observer/ future user') and your Internet-
- accessible email address and I will put you on the list and/ or
- reply as soon as possible.
-
-
- When is setext
- ----------------
- Due to a varying work load and other distractions between the
- original announcement of the planned release and the actual date
- of it, the browser that I am writing is not yet ready. I do not
- intend to repeat the mistake of preannouncing it again. Instead
- please feel free to join the mailing list through which the rest
- of the specifications will be published. The full release will
- contain approximately 150K worth of setexts on setext along with
- a demo browser written in HyperCard (2.0) that will permit
- showing of the format's capabilities in a dynamic rather than
- the strictly textual and sequential fashion. Those of you who
- know me, know also of the high standards of coding that I try to
- adhere to.
-
- If you're among those that have already written a prototype
- that's based mainly on a reverse-engineered layout of the current
- TidBITS then you'd be well advised not to release it without prior
- validation of it by me. Please do not call your product a
- "setext browser" (or whatever) UNLESS it is truly capable of
- parsing all (future) setextized docs, not solely the TidBITS.
-
-
- How is setext
- ---------------
- A lot can (and will) be said about it but there is one claim no
- other text encoding method can make: "there is a lot more of me
- than meets the eye" ;-))
-
-
- Who is setext
- ---------------
- The setext format and its underlying philosophy isBroughtToYouBy
- Ian Feldman <ianf@random.se>. I live in Stockholm, Sweden, Europe.
- I used to work as and describe myself variously over the years
- but now simply content myself with being just a free Human Factors
- thinker and tinkerer.
-
-
- .. last line contains a twodot-tt, a tag signifying the logical end of
- .. text while those three lines are all suppress-typotagged ones, i.e.
- .. can be suppressed (hidden) by a front-end application by default.
- ..
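
The two trailer typotags described in the lines above lend themselves to a
very small filter. This is a hedged sketch in Python, assuming that both
typotags start in the first column of a line, that a suppress-tt is ".."
followed by a space, and that a twodot-tt is a line holding nothing but
".."; the function name visible_body is hypothetical.

```python
def visible_body(lines, show_suppressed=False):
    """Return the lines a front-end would display by default.

    Assumptions: a line that is exactly ".." is the twodot-tt (logical end of
    text); a line starting with ".. " is suppress-typotagged and hidden unless
    show_suppressed is True.
    """
    shown = []
    for line in lines:
        text = line.rstrip()
        if text == "..":
            break  # twodot-tt: nothing after this belongs to the text proper
        if text.startswith(".. "):
            if show_suppressed:
                shown.append(text)
            continue  # suppress-tt: hidden by default
        shown.append(text)
    return shown


if __name__ == "__main__":
    sample = ["Body line", ".. a hidden note", "..", "Footer after the end"]
    print(visible_body(sample))
    print(visible_body(sample, show_suppressed=True))
```

Whatever follows the twodot-tt, such as a distribution footer added by a
file server, is never returned by this sketch.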
-
- This text is wrapped as a setext. For an index of informational
- files on our sponsors' products, or to submit comments or
- suggestions, please send email to <sponsors@tidbits.com>.
-
-
- -----------------------------------------------------------------------
- This information brought to you by the TidBITS Fileserver, conveniently
- located near you at <fileserver@tidbits.com>. To speak with a
- human, send email to Adam C. Engst at <ace@tidbits.com>. Enjoy!
-
-
-